Abstract


Acoustic emission (AE) data were acquired during fatigue testing of an alumi- 
num 2024-T4 compact tension specimen using a commercially available AE sys- 
tem. AE signals from crack extension were identified and separated from noise 
spikes, signals that reflected from the specimen edges, and signals that saturated
the instrumentation. A commercially available software package was used to train 
a statistical pattern recognition system to classify the signals. The software 
trained a network to recognize signals with a 91-percent accuracy when com- 
pared with the researcher's interpretation of the data. Reasons for the discrepan- 
cies are examined and it is postulated that additional preprocessing of the AE 
data to focus on the extensional wave mode and eliminate other effects before 
training the pattern recognition system will result in increased accuracy. 


Introduction 

Acoustic emission (AE) is defined as “the class of 
phenomena whereby transient elastic waves are gener- 
ated by the rapid release of energy from localized 
sources within a material (or structure) or the transient 
waves so generated” (ref. 1). Acoustic emission can be 
generated by a variety of sources, including crack 
nucleation and propagation, multiple dislocation slip, 
twinning, grain boundary sliding, Barkhausen effect 
(realignment or growth of magnetic domains), phase 
transformations, and debonding and fracture of inclusions.
Acoustic emission can also be generated by
sources other than materials under stress, such as components
rubbing against one another (fretting), leaks,
structural vibrations, and electrical transients. Spanner
(ref. 2) and Williams (ref. 3) have provided discus- 
sions of sources of acoustic emission in a variety of 
materials and applications. Effective use of acoustic 
emission for monitoring damage progression in struc- 
tures requires interpretation of the AE signals to deter- 
mine the sources of the AE, their locations, and their 
severity. An experienced AE practitioner can learn to 
recognize signals from different sources, but some
uncertainty about part of the data always remains. Current AE
systems, such as the one used in this study, can record 
up to 200 waveforms per second. Pattern recognition 
algorithms exist for training computers to recognize 
and interpret the signals. The objective of this project 
was to investigate the applicability of statistical pat- 
tern recognition to the identification of crack signals in 
a well-controlled test with limited sources of acoustic 
emission as a prelude to a possible application to monitoring
crack growth in aging aircraft. The initial
approach was to use a commercially available software
package to extract features from the acoustic
emission signals and perform the pattern recognition.

Pattern recognition methods require that a network 
first be trained to recognize signals; this is also called 
learning. A set of signals representing the different 
classes of data to be learned is provided as input to
the network along with their classes. The network ana- 
lyzes the differences between the signals and deter- 
mines which characteristics best define each class of 
data. It compares its calculations with the known 
classes of the signals provided by the user. Where 
there is ambiguity, or disagreement with the classes 
provided, there is training error. The network can con- 
tinue to refine its analysis to minimize the training 
error. Once the training error is minimized, the learn- 
ing is complete and one or more classifiers are devel- 
oped. These classifiers may be developed with the 
same technique used in the learning phase, or different 
techniques may be used. 

The second phase of pattern recognition is classi- 
fication. New signals are input to the network and ana- 
lyzed by using the classifiers developed in the learning 
stage. The network does not know the classes of these 
signals but determines their classes based upon the 
classifiers. If several classifiers are used, they may not 
all agree on the classes of all the signals. If the user 
knows the classes of the signals, he may evaluate the 
results of the classification based upon his knowledge 
of the signals. Any discrepancies between the classifi- 
ers and the user's knowledge are classification errors. 

In this work, a k-nearest neighbor algorithm was
used in the learning phase, and the training error was
calculated and minimized. Classifiers were developed
for the data by using k-nearest neighbor, Gaussian 
probability density, and Fisher linear discriminant 
methods. A detailed description of statistical pattern 
recognition and these classifiers is found in 
appendix A. 

TestPro software by Infometrics, Inc., was used to 
perform the pattern recognition analysis. The software 
is part of a computer-based instrument for ultrasonic 
and eddy-current inspection and was developed spe- 
cifically for those applications. The feature extraction 
module is particularly tailored to the analysis of these 
signals and not to acoustic emission signals. The sta- 
tistical pattern recognition methods used, however, are 
generic and applicable to many problems in signal 
classification. Hinton (ref. 4) previously used this soft- 
ware to classify and recognize acousto-ultrasonic sig- 
nals from defects in composite panels. In this 
composite panel study, five sets of panels, each with 
different model defects of varying severity, were 
examined and the data classified with TestPro soft- 
ware, with zero training error for four sets and 2 per- 
cent training error for the fifth set. The software was 
used in this study to determine its applicability to the 
classification of acoustic emission signals. The soft- 
ware is described in appendix B. 

Experimental Procedure 

A 2024-T4 aluminum compact tension specimen 
was tested in tension-tension fatigue. The specimen 
was a variation of that specified in reference 5. The 
specimen was approximately 15.24 cm (6 in.) square
and 0.32 cm (1/8 in.) thick, with a straight-through 
notch of 6.35 cm (2.5 in.). The notch introduces a 
stress concentration that initiates crack growth under 
cyclic loading. The initial maximum and minimum 
loads were 3314 and 823 N (745 and 185 lb), respec- 
tively (load ratio R = 0.248). Four Digital Wave 
B1025 AE sensors were mounted on the specimen, as 
shown in figure 1, with silicone grease couplant and 
held on with C-clamps. These sensors have an ampli- 
tude response of ±15 dB and a phase response of 
±3 in the range from 0.1 to 1 MHz, as shown in 
figure 2. The sensor output was amplified 40 dB by 
Digital Wave PA2040 G/A preamplifiers, then digi- 
tized and stored with a Digital Wave F4000 FWD AE 
analysis system. The AE system includes high and low
pass filters and amplifiers on each channel, one of
each for triggering and one of each for the data. The
data channels were set to 0.02 MHz high pass and
1.5 MHz low pass filters and 30 dB gain. The trigger
circuitry was set to 0.3 MHz high pass and 1.0 MHz
low pass filters, 36 dB gain, and 0.1-V threshold. The
system triggers when the signal on any channel
exceeds the threshold and then records data on all four
channels. The system recorded 2048 points per waveform
at 30 MHz sampling rate (0.033 µsec/point) with
25 percent pretrigger (512 points, 17.067 µsec pretrigger;
1536 points, 51.2 µsec posttrigger). The specimen
was cycled at 1 Hz until a crack was visible to the
naked eye. At that point the AE data acquisition
began. A load gate was used during part of the test to
allow the system to acquire AE data only during the
highest 20 percent of the load, which is when crack
extension is expected to occur. This reduces the
amount of data from other sources such as crack face
rubbing, which cannot occur when the crack opening
load is exceeded. Figure 3 is a schematic of the test
setup that shows the fatigue specimen with four sensors
and preamplifiers and the acoustic emission data
acquisition system.

Figure 1. 2024-T4 aluminum fatigue specimen with four
acoustic emission sensors.

Figure 2. Absolute calibration of sensor, sensitivity, and
phase, using laser interferometer to measure surface displacement,
traceable to National Institute of Standards and
Technology. (Horizontal axis: frequency, MHz.)



Figure 3. Schematic of test setup. 
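
The record timing quoted in the experimental procedure follows directly from the sampling rate and the record length. The short Python sketch below reproduces that arithmetic. The function and variable names are illustrative only and are not part of the AE system software.

```python
# Acquisition settings quoted in the text.
SAMPLE_RATE_HZ = 30e6        # 30 MHz sampling rate
POINTS_PER_WAVEFORM = 2048   # points recorded per waveform
PRETRIGGER_FRACTION = 0.25   # 25 percent pretrigger

def record_timing(sample_rate_hz, n_points, pretrigger_fraction):
    """Return the sample interval and the pretrigger/posttrigger split.

    Times are returned in microseconds.
    """
    dt_us = 1e6 / sample_rate_hz
    pre_points = int(n_points * pretrigger_fraction)
    post_points = n_points - pre_points
    return dt_us, pre_points, pre_points * dt_us, post_points, post_points * dt_us

dt_us, pre_n, pre_us, post_n, post_us = record_timing(
    SAMPLE_RATE_HZ, POINTS_PER_WAVEFORM, PRETRIGGER_FRACTION)

print(f"sample interval: {dt_us:.3f} usec/point")
print(f"pretrigger:  {pre_n} points, {pre_us:.3f} usec")
print(f"posttrigger: {post_n} points, {post_us:.1f} usec")
```

The results agree with the 512-point, 17.067 µsec pretrigger and 1536-point, 51.2 µsec posttrigger values given above.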

Analysis and Discussion 

Figure 4. Typical crack signal as received at each of four sensors mounted on surface of fatigue specimen.

Two classes of signals were initially identified for
training: cracks and noise. A typical crack signal is
shown in figure 4 as received at all four sensors
mounted on the surface of the fatigue specimen used during
the test. The first 17 µsec of each signal is prior to the
system being triggered. In this example, the signal first
exceeds the 0.1-V threshold on channel 2. Channels 1
and 2 both show a rise time to peak amplitude within
the next 2 to 3 µsec, and a decaying amplitude thereafter.
The first one or two cycles of these signals are of
lower frequency, followed by some higher frequency
arrivals, an artifact of the extensional S0 mode dispersion
curves that have the very high frequencies traveling
at a lower velocity than the earlier nondispersive
low frequency modes. This signal appears to be a pure
extensional mode wave with no flexural modes 
present, as expected for a through-thickness fatigue 
crack source, as discussed by Gorman (ref. 6). Signals 
resembling those shown in figure 4 were classified as 
crack signals; all others were grouped into the class of 
noise signals. Forty signals representative of cracks 
and 64 signals representative of noise signals were 
used to train a 6-nearest neighbor system. These sig- 
nals were acquired with maximum and minimum 
loads of 2478 and 1757 N (557 and 395 lb). The soft- 
ware reported a training error of 0 percent. Fisher, 
Gaussian, and 3-nearest neighbor classifiers were 
developed, with reported classification errors of 6.7, 
1.9, and 0 percent in classifying the training data. An 
additional 752 signals, acquired with load cycling 
from 3314 to 823 N (745 to 185 lb) and without the 
load gate, were then analyzed by each of the classifi- 
ers. Of these 752 signals, 276 showed characteristics 
of crack signals. The Fisher classifier reported 420 
crack signals, the Gaussian classifier reported 604 
crack signals, and the 3-nearest neighbor classifier 
reported 620 crack signals, representing classification
errors of at least 19 to 45 percent. Based on statistical 
pattern recognition concepts (ref. 7), these large dis- 
crepancies clearly indicate that the training set was not 
a good representation of the remaining data. Because 
these data were acquired without the use of a load 
gate, additional signals were likely acquired from 
other sources, for example, crack face rubbing and pin 
noise, that were not included in the training data. The 
class of noise signals was, therefore, redefined to 
accommodate some of these other sources. 
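
One way to arrive at a lower bound on the classification error from the counts quoted above is sketched below in Python. It assumes, as an interpretation not stated in the text, that every signal reported as a crack in excess of the 276 identified by inspection is a misclassification. The names are illustrative.

```python
# Counts quoted in the text.
TOTAL_SIGNALS = 752
CRACKS_BY_INSPECTION = 276
reported_cracks = {
    "Fisher": 420,
    "Gaussian": 604,
    "3-nearest neighbor": 620,
}

def min_error_percent(reported, actual, total):
    """Lower bound on classification error, in percent.

    Assumption: even if every actual crack is among the reported cracks,
    the surplus (or deficit) must consist of misclassified signals.
    """
    return 100.0 * abs(reported - actual) / total

lower_bounds = {
    name: round(min_error_percent(count, CRACKS_BY_INSPECTION, TOTAL_SIGNALS), 1)
    for name, count in reported_cracks.items()
}

for name, bound in lower_bounds.items():
    print(f"{name}: at least {bound} percent")
```

Under this assumption the bounds come out at roughly 19, 44, and 46 percent, in line with the range quoted above.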

After examining the 752 signals used for analysis, 
four classes of signals were identified: cracks, reflec- 
tions, saturation, and spikes. Examples of these signals 
are shown in figure 5. The signals classified as reflections
have significant oscillations during the pretrigger
period. This type of signal is indicative of one
that was reflected from the specimen edges and triggered
the AE system to acquire new data as though
from a separate signal. The saturation class comprises
signals that saturated the electronics and were clipped.
The spikes were very sharp, very short duration signals,
typically of 1 to 2 µsec, which were believed to
come from electrical noise. Training sets of 40 crack, 
44 reflection, 40 saturation, and 20 spike signals were 
used to train the pattern recognition system. The mini- 
mum training error achieved for the 4-nearest neigh- 
bor algorithm was 9.5 percent. The Gaussian, Fisher, 
3-, 4-, and 5-nearest neighbor classifiers were devel- 
oped to analyze the additional data. The analysis 
resulted in classification errors of 5, 18, 10, 15, and 
10 percent, which shows a significant increase in clas- 
sification error over the case of two classes, cracks and 
noise. However, only one of the 40 training signals 
from cracks was improperly classified. 

To evaluate the accuracy of the discriminant func- 
tions derived by the software, 564 signals, represent- 
ing 141 events on each of 4 channels, were then 
analyzed by using each of the classifiers, and the 
results were compared with a personal evaluation of 
the unknown signals. The single Gaussian classifier 
resulted in the lowest classification error, with 8 of 59 
(14 percent) crack signals wrongly identified as 
belonging to one of the other classes, and 8 of 91 
(9 percent) signals which belong to other classes 
wrongly identified as cracks. The remaining signals 
did not appear to belong to any of the defined classes 
based on the characteristics described previously; 
therefore, they were not included in the analysis. 
Although the training errors using four classes are 
much higher than those using two classes, the actual 
classification of the additional waveforms showed 
improvement from errors in the 19- to 45-percent 
range with two classes, to about 10 percent in this case
(16 of 150 signals). This error was, however, judged 
still to be unacceptably high, based on prior experi- 
ence with this software (ref. 4). Therefore, an effort 
was made to further refine the definitions of the train- 
ing sets. Because only one crack signal in the training 
set was wrongly classified, the noise signals were 
examined in an attempt to improve their representation 
in the training set. 
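
The error rates quoted for the Gaussian classifier can be checked with a few lines of Python. The sketch below separates missed cracks from false crack indications; the variable names are illustrative.

```python
def rate_percent(errors, total):
    """Error rate in percent."""
    return 100.0 * errors / total

# Counts quoted in the text for the Gaussian classifier.
missed_cracks, crack_signals = 8, 59   # cracks assigned to another class
false_cracks, other_signals = 8, 91    # other signals identified as cracks

miss_rate = rate_percent(missed_cracks, crack_signals)
false_crack_rate = rate_percent(false_cracks, other_signals)
overall_rate = rate_percent(missed_cracks + false_cracks,
                            crack_signals + other_signals)

print(f"missed cracks:  {miss_rate:.1f} percent")
print(f"false cracks:   {false_crack_rate:.1f} percent")
print(f"overall error:  {overall_rate:.1f} percent")
```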

Figure 5. Representative signals from each of four classes: cracks, reflections, saturation, and spikes.

Upon reexamination of the data, a fifth class of
signals was identified. These signals are lower in
frequency than the crack signals, suggesting an
out-of-plane source motion or flexural wave, as
discussed by Gorman and Prosser (ref. 8). They appear
to occur at lower loads and may be indicative of crack 
face rubbing or pin loading noises. This fifth class was 
added to the training set, and the system trained again 
using the 4-nearest neighbor algorithm. The training 
error rose from 9 to 15 percent. The classification 
errors in identifying crack signals rose from 1 of 40 
to 5 of 40; the remaining errors were in the other four 
classes. 

Peak amplitude and peak-to-peak amplitude of 
acoustic emission signals are not effective means of 
identifying sources because signal amplitudes are 
greatly affected by attenuation. In figure 4, for exam- 
ple, the amplitude of the signals changes significantly 
for sensors at different distances from the crack, where 
propagation distances are only a few centimeters at 
most. Attenuation is even more significant when geo- 
metric spreading is dominant, when the wave modes 
are highly dispersive (as is the case with flexural 
waves), and in highly attenuating materials such as 
composites. Nevertheless, the decision was made to
add the amplitude features to the training set to determine
if they would help further identify signals from
each of the classes. The training process was then 
repeated for four and five training data sets. For four 
classes of data, the reduction in training error, from 9 
to 7 percent, was insignificant; with five classes of 
data, these amplitudes had no effect on the training 
error. 

According to Fukunaga (ref. 7), the training error 
and classification errors could indicate one or more of 
several problems: 

1. The training set is not representative of the 
analysis data 

2. The training set is too small, not indicative of 
the range of differences among the analysis 
signals 

3. The features calculated by the software are not 
appropriate for analyzing these data

Inspection of the data indicates a fourth possible
source of error: there are too many data points per sig- 
nal; that is, there is too much extraneous information 
in the data. Each of these problems is discussed. Other 
effects, including mode conversion filtration and dis- 
tortion of the original stress wave resulting from crack 
growth, the frequency response of the measurement 
system being too low to capture this wave, and edge 
reflection interferences, are also possible factors in the 
inability to use these methods. 

The signals in the analysis data were chosen 
because they have the same visual characteristics as 
those in the training set. However, the statistical char- 
acteristics of the feature set are used for training and 
analysis. Poor agreement between training and analy- 
sis results indicates that the signals are statistically 
different. 

Training the pattern recognition system requires a 
data set of sufficient size to analyze statistical differ- 
ences in the data. The software recommends training 
sets of 10 or more signals. The training sets used were 
larger than this and should be of sufficient size. How- 
ever, the signals used resulted from one acoustic emis- 
sion event being detected at each of four sensors, and 
the signals change as they propagate along the plate. 
This results in signals that can have different visual, 
temporal, and statistical characteristics at each of the 
four sensors being included in the same class. There- 
fore, the training signals are possibly not truly repre- 
sentative of the variances in signals within each signal 
classification. This effect can be eliminated by using 
data only from the sensor at which the signal from 
each event was received first and only the first few 
microseconds of the recorded waveform. 

The feature set was provided by the chosen soft- 
ware. It has been used successfully to characterize 
ultrasonic signals, which have some characteristics in 
common with acoustic emission signals. However, 
there are significant differences that may render these 
features inappropriate for this application. Further sta- 
tistical analysis of the data may reveal other features 
that better identify the statistical differences in the 
signals. 

Each crack event during the test causes signals to 
be recorded on each of four channels. All four channels
begin recording when one channel is triggered,
and some pretrigger data are also stored with the signal.
Because the sensors are at different distances from
the crack, the data on each channel include a varying 
amount of signal acquired before the crack signal 
reaches the sensor. The latter portions also show the 
effects of attenuation and dispersion before reception 
at the sensor. Gorman and Prosser (ref. 8) have shown 
that, for in-plane sources such as crack extension, the 
modal information indicative of extensional waves is 
in the first several microseconds of the signal. The lat- 
ter part of the signal is dominated by reflections. The 
velocity of the extensional wave mode in 2024-T4 aluminum
is 5380 m/sec. If the crack is 7.5 cm from the
edge of the specimen, reflections of the original signal
will return to the crack position within about 28 µsec.
They would reach a transducer between the crack and
the edge of the specimen even earlier. Thus, most of
the information in the signals after the first 10 µsec or
so is heavily affected by reflections and artifacts of
geometry. Eliminating the pretrigger portion of the
signal, and all but the first 10 µsec of the remaining
signal, should focus on the extensional wave and eliminate
much of the variation caused by reflections. Any
attempt to use pattern recognition to classify acoustic
emission signals as to their source must take into
account that the signals are heavily affected by mate- 
rial properties and geometry. The other effects men- 
tioned require additional experimentation to determine 
their relevance to the classification of these signals. 
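
The preprocessing proposed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a procedure that was applied in this study: the helper names are hypothetical, the first-hit sensor is taken to be the channel whose absolute amplitude reaches the 0.1-V threshold earliest, and the synthetic event is made up.

```python
SAMPLE_RATE_HZ = 30e6              # sampling rate quoted in the text
PRETRIGGER_POINTS = 512            # 25 percent of a 2048-point record
EXTENSIONAL_VELOCITY_M_S = 5380.0  # extensional mode velocity quoted in the text

def reflection_delay_us(distance_to_edge_m, velocity_m_s=EXTENSIONAL_VELOCITY_M_S):
    """Round-trip time from the source to an edge and back, in microseconds."""
    return 2.0 * distance_to_edge_m / velocity_m_s * 1e6

def first_hit_channel(waveforms, threshold_v=0.1):
    """Index of the channel whose absolute amplitude reaches the threshold earliest."""
    best_channel, best_index = None, None
    for channel, waveform in enumerate(waveforms):
        for index, value in enumerate(waveform):
            if abs(value) >= threshold_v:
                if best_index is None or index < best_index:
                    best_channel, best_index = channel, index
                break  # only the first crossing on each channel matters
    return best_channel

def extensional_window(waveform, window_us=10.0,
                       pretrigger_points=PRETRIGGER_POINTS,
                       sample_rate_hz=SAMPLE_RATE_HZ):
    """Discard the pretrigger samples and keep the first window_us microseconds."""
    n_keep = int(round(window_us * 1e-6 * sample_rate_hz))
    return waveform[pretrigger_points:pretrigger_points + n_keep]

# Synthetic four-channel event (made-up data): channel 1 crosses first.
event = [[0.0] * 2048 for _ in range(4)]
event[0][520] = 0.5
event[1][512] = 0.5
event[2][530] = 0.5
event[3][540] = 0.5

channel = first_hit_channel(event)
window = extensional_window(event[channel])
print(channel, len(window), round(reflection_delay_us(0.075), 1))
```

For a crack 7.5 cm from the edge, the computed round-trip delay is about 28 µsec, and a 10 µsec window at 30 MHz retains 300 points of the 2048 recorded.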

Concluding Remarks 

In a laboratory fatigue test, TestPro software was 
unable to learn to classify acoustic emission signals 
from cracks with less than 9 percent classification 
error. This classification error may be acceptable in 
applications where multiple cracks, or very long 
cracks, can be tolerated. In applications where detec- 
tion of small cracks, or small numbers of cracks, is 
critical, this classification error level is likely to be 
unacceptable. Further, where additional acoustic emis- 
sion signals are generated from other sources, the clas- 
sifiers developed may not be adequate to identify the 
signals from cracks. Further preprocessing of the 
acoustic emission signals may allow the software to 
classify the signals with greater accuracy. A different 
set of features that more accurately represents the dif- 
ferences observed in the signals may also give better 
accuracy.

Appendix A

Statistical Pattern Recognition 

Pattern recognition approaches can be classified as 
either syntactic or statistical. With syntactic methods, 
the observations or signals to be analyzed are broken 
down into smaller parts, the way a language or sen- 
tence is parsed. The relationships between the parts 
are analyzed in a way similar to the ways that syntax 
rules express the relationships between parts of 
speech. These methods are used when a pattern is so 
complex that it is best analyzed as a composition of 
simpler subpatterns, as in fingerprint or scene analysis
(ref. 9). Statistical methods, however, rely on mathe- 
matical models of the observations to be analyzed and 
the relationships among them. A set of measurements, 
or features, is extracted from each observation. These 
features should be invariant, or less sensitive to com- 
monly encountered variations and distortions, and less 
redundant, than the observations themselves. These 
methods have been applied to waveform classification 
as summarized by Fukunaga (ref. 7) upon which the 
following discussion is based. 

Statistical pattern recognition consists of, first, 
representing each observation as a vector in 
n-dimensional space, where each dimension n is a fea- 
ture used to characterize each observation. Several 
such observations, represented by their vectors, form a 
distribution in feature space. Each distribution can be 
approximated by some probability density function, 
which expresses the likelihood that a vector which lies 
within the contour of the function belongs to that dis- 
tribution. The boundaries which separate these distri- 
butions must be determined and expressed as 
mathematical functions, which are known as discrimi- 
nant functions. Once these discriminant functions are 
determined, a pattern recognition network, or classi- 
fier, analyzes a given vector and determines to which 
distribution it belongs. The process of finding the 
proper discriminant function is called learning or 
training; the samples used to design the classifier com- 
prise the training set. 

For simplicity in discussing classifier design, con- 
sider the case of two distributions or classes. Ideally 
these two classes are totally distinct and separate in 
feature space with no overlap. In this case, the training
error is zero, and designing a classifier requires only
consideration of the region in feature space between 
the classes. One can develop a linear classifier by 
drawing a line bisecting and perpendicular to a line 
connecting the means of the two classes. This process 
gives a simple method for classifying observations 
that fall on either side of the line. Observations that 
fall directly on the line can be classified randomly or 
rejected. 
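
The perpendicular-bisector construction described above can be written out in a few lines of Python. The sketch below is illustrative only: the feature values are made up, the function names are hypothetical, and observations falling exactly on the boundary are rejected.

```python
def mean_vector(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def bisector_classifier(class1, class2):
    """Build h(x) = w.x + offset for the perpendicular bisector of the class means.

    h < 0 lies on the class 1 side, h > 0 on the class 2 side,
    and h == 0 lies on the boundary itself.
    """
    m1, m2 = mean_vector(class1), mean_vector(class2)
    w = [q - p for p, q in zip(m1, m2)]            # direction from mean 1 to mean 2
    mid = [(p + q) / 2 for p, q in zip(m1, m2)]    # midpoint between the means
    offset = -sum(wi * ci for wi, ci in zip(w, mid))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + offset

def classify(h, x):
    """Apply the bisector rule; observations on the line are rejected."""
    value = h(x)
    if value == 0:
        return "reject"
    return "class 1" if value < 0 else "class 2"

# Made-up, well-separated two-feature training data.
class1 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # mean (0.5, 0.5)
class2 = [[4.0, 4.0], [5.0, 4.0], [4.0, 5.0], [5.0, 5.0]]   # mean (4.5, 4.5)
h = bisector_classifier(class1, class2)

print(classify(h, [1.0, 1.0]), classify(h, [4.0, 4.0]), classify(h, [2.5, 2.5]))
```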

In the more general case, the classes are not totally 
distinct and separate in feature space, but do overlap; 
this results in training error. The classifier must be 
designed to minimize the error associated with observations
in the overlap region. Let X be a random
n-dimensional vector, as discussed in Papoulis
(ref. 10), whose components are features representing
a test sample, that is, an observation to be classified. In
figure A1, $\omega_1$ and $\omega_2$ are two classes in feature space.
We define a linear discriminant function h(X) as:

\[
h(X) = V^T X + v_0 \;\underset{\omega_2}{\overset{\omega_1}{\lessgtr}}\; 0 \tag{1}
\]

Figure A1. Example of linear mapping, showing two
classes, $\omega_1$ and $\omega_2$, mapped onto vectors V and V' with
errors and $v_0$. (From ref. 7 (used with permission).)

The vector X is projected onto a vector V, whose
transpose is $V^T$, and the variable $y = V^T X$ in the projected
one-dimensional h-space is classified to either
$\omega_1$ or $\omega_2$ depending on whether h(X) < 0 or h(X) > 0.
Figure A1 shows two possible choices of V and the
corresponding choices of $v_0$. The optimum classifier
selects the values of V and $v_0$ which give the smallest
error in the projected h-space. The Fisher criterion f
for determining the optimum V and $v_0$ is

\[
f = \frac{(\eta_1 - \eta_2)^2}{\sigma_1^2 + \sigma_2^2} \tag{2}
\]

where $\eta_1$, $\eta_2$, $\sigma_1^2$, and $\sigma_2^2$ are the means and variances,
respectively, of the classes $\omega_1$ and $\omega_2$, and f
measures the difference of the two means normalized
by the average variance. The means $\eta_i$ and variances
$\sigma_i^2$ can be expressed in terms of V and $v_0$ as

\[
\eta_i = V^T M_i + v_0 \tag{3}
\]

\[
\sigma_i^2 = V^T \Sigma_i V \tag{4}
\]

where

$\Sigma_i$   covariance matrix of $\omega_i$

$M_i$   expected vector or mean of X for $\omega_i$

Substituting equations (3) and (4) into equation (2),
differentiating with respect to V and $v_0$, and setting the
derivative equal to zero yield V with the minimum
error as follows:

\[
V = \left[\tfrac{1}{2}\Sigma_1 + \tfrac{1}{2}\Sigma_2\right]^{-1} (M_2 - M_1) \tag{5}
\]

Substituting equation (5) into equation (1) yields the
Fisher linear discriminant function $h_F(X)$,


\[
h_F(X) = (M_2 - M_1)^T \left[\tfrac{1}{2}\Sigma_1 + \tfrac{1}{2}\Sigma_2\right]^{-1} X + v_0 \;\underset{\omega_2}{\overset{\omega_1}{\lessgtr}}\; 0 \tag{6}
\]

Linear discriminant functions are optimum only for
normal distributions with equal covariance matrices.
The assumption of equal covariance matrices is reasonable
in many applications involving signal detection
where the noise is random and does not change
from one signal to another.

The random vector X, with n variables
$[x_1\; x_2\; \ldots\; x_n]^T$, is the input to the pattern recognition
network. It is a property of a random vector that it
can be characterized by a probability distribution
function P(X):

\[
P(x_1, \ldots, x_n) = \Pr\{\mathbf{x}_1 \le x_1, \ldots, \mathbf{x}_n \le x_n\} \tag{7}
\]

which may also be written

\[
P(X) = \Pr\{\mathbf{X} \le X\} \tag{8}
\]

where Pr{A} is the probability of an event A, and X is
a given vector (boldface denotes the random quantity).
It is also a property of a random vector
that it can be characterized by a density function p(X),
the derivative of the distribution function,

\[
p(X) = \lim_{\substack{\Delta x_1 \to 0 \\ \vdots \\ \Delta x_n \to 0}}
\frac{\Pr\{x_1 < \mathbf{x}_1 \le x_1 + \Delta x_1, \ldots, x_n < \mathbf{x}_n \le x_n + \Delta x_n\}}{\Delta x_1 \cdots \Delta x_n}
= \frac{\partial^n P(X)}{\partial x_1 \cdots \partial x_n} \tag{9}
\]

denoting differentiation of the distribution function
with respect to each of the components of the vector
X. The density function p(X) is not a probability, but
must be multiplied by a region $\Delta X$ to obtain a
probability. An explicit expression of p(X) for a normal
distribution is

\[
N_X(M, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left[-\tfrac{1}{2} d^2(X)\right] \tag{10}
\]

where $N_X(M, \Sigma)$ is a normal distribution with expected
vector M and covariance matrix $\Sigma$, and

\[
d^2(X) = \sum_{i=1}^{n} \sum_{j=1}^{n} h_{ij} (x_i - m_i)(x_j - m_j) \tag{11}
\]

where $h_{ij}$ is the ij component of $\Sigma^{-1}$, the inverse covariance
matrix, and $m_i$ is the expected value or mean of
$x_i$. The coefficient $(2\pi)^{-n/2} |\Sigma|^{-1/2}$ is selected to satisfy
the probability condition

\[
\int p(X)\, dX = 1 \tag{12}
\]

Equation (10) expresses the probability that a given 
vector X is a member of the class defined by the normal
distribution N. The Gaussian probability density
function classifier assigns the test sample to the class
for which this function is maximum. 
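
The Fisher direction of equation (5) and the Gaussian density classifier based on equations (10) and (11) can be illustrated for two features with the Python sketch below. Several choices here are assumptions made for illustration and are not taken from the TestPro software or from reference 7: the training data are made up, the covariance matrices are estimated with the n - 1 sample formula, and the threshold $v_0$ is placed midway between the projected class means because the text does not give an expression for it.

```python
import math

def mean_vector(samples):
    """Component-wise mean of a list of 2-feature vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def covariance_2d(samples):
    """Sample covariance matrix (2 x 2) of a list of 2-feature vectors."""
    m = mean_vector(samples)
    n = len(samples)
    sxx = sum((x - m[0]) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - m[1]) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - m[0]) * (y - m[1]) for x, y in samples) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def inverse_2d(a):
    """Inverse and determinant of a 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    inv = [[a[1][1] / det, -a[0][1] / det],
           [-a[1][0] / det, a[0][0] / det]]
    return inv, det

def gaussian_density(x, mean, cov):
    """Normal density of equation (10) for n = 2 features."""
    inv, det = inverse_2d(cov)
    d = [x[0] - mean[0], x[1] - mean[1]]
    # Squared distance of equation (11): sum of h_ij (x_i - m_i)(x_j - m_j).
    d2 = sum(inv[i][j] * d[i] * d[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * d2) / (2.0 * math.pi * math.sqrt(det))

def fisher_discriminant(class1, class2):
    """Return (V, v0), with V from equation (5) and an assumed midpoint v0."""
    m1, m2 = mean_vector(class1), mean_vector(class2)
    c1, c2 = covariance_2d(class1), covariance_2d(class2)
    avg = [[0.5 * (c1[i][j] + c2[i][j]) for j in range(2)] for i in range(2)]
    inv, _ = inverse_2d(avg)
    diff = [m2[0] - m1[0], m2[1] - m1[1]]
    v = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    # Assumption: boundary midway between the projected class means.
    v0 = -0.5 * sum(v[i] * (m1[i] + m2[i]) for i in range(2))
    return v, v0

def fisher_classify(x, v, v0):
    """Sign convention of equation (1): h < 0 gives class 1, otherwise class 2."""
    h = v[0] * x[0] + v[1] * x[1] + v0
    return 1 if h < 0 else 2

def gaussian_classify(x, class1, class2):
    """Assign x to the class whose fitted normal density is larger."""
    p1 = gaussian_density(x, mean_vector(class1), covariance_2d(class1))
    p2 = gaussian_density(x, mean_vector(class2), covariance_2d(class2))
    return 1 if p1 >= p2 else 2

# Made-up two-feature training data for two classes.
class1 = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
class2 = [[5.0, 5.0], [7.0, 5.0], [5.0, 7.0], [7.0, 7.0]]
V, v0 = fisher_discriminant(class1, class2)

print(V, v0)
print(fisher_classify([1.0, 1.5], V, v0), gaussian_classify([1.0, 1.5], class1, class2))
print(fisher_classify([6.0, 5.5], V, v0), gaussian_classify([6.0, 5.5], class1, class2))
```

Because the two made-up classes have equal covariance matrices, the two classifiers agree on these test points, consistent with the remark that linear discriminant functions are optimum for that case.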

In the k-nearest neighbor approach, the k nearest 
neighbors (kNN’s) of a test sample are selected from 
the mixture of classes in feature space, and the number 
of neighbors from each class among the k selected 
samples is counted. The test sample is then classified 
to the class represented by a majority of the kNN’s. 
Ties can be broken at random or rejected and not 
classified (ref. 11). 
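
A minimal Python sketch of the k-nearest neighbor rule described above follows. It assumes Euclidean distance in feature space and rejects ties; the distance measure, the class labels, and the feature values are assumptions made for illustration. With more than two classes, the sketch assigns the class with the most votes among the k neighbors.

```python
import math
from collections import Counter

def knn_classify(sample, training, k):
    """Classify a sample by vote among its k nearest training samples.

    training is a list of (feature_vector, label) pairs.
    Returns the winning label, or None when the top vote is tied (rejected).
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return None  # tie: reject rather than classify at random
    return votes[0][0]

# Made-up two-feature training data.
training = [
    ([0.0, 0.0], "crack"), ([1.0, 0.0], "crack"), ([0.0, 1.0], "crack"),
    ([5.0, 5.0], "noise"), ([6.0, 5.0], "noise"), ([5.0, 6.0], "noise"),
]

print(knn_classify([0.5, 0.5], training, 3))
print(knn_classify([5.5, 5.5], training, 3))

# A tie with k = 2: one neighbor from each class at the same distance.
tie_set = [([0.0, 0.0], "crack"), ([2.0, 0.0], "noise"), ([10.0, 10.0], "noise")]
print(knn_classify([1.0, 0.0], tie_set, 2))
```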
Appendix B
TestPro Software 

TestPro software by Infometrics, Inc., is a 
computer-based instrument for ultrasonic and eddy- 
current inspection. The software incorporates data 
acquisition and analysis routines into a package spe- 
cifically tailored for these applications. The feature 
extraction and pattern recognition modules use stan- 
dard statistical algorithms; however, the selection of 
features to extract from the signals is specifically cho- 
sen to be applicable to ultrasonic and eddy-current sig- 
nals commonly encountered in nondestructive 
evaluation. Acoustic emission signals bear some simi- 
larity to ultrasonic signals, particularly when ultra- 
sonic sensors are used for their detection. They are 
very different, however, in that they are generated by 
physical and mechanical phenomena in a material or 
structure, whereas ultrasonic signals are applied to a 
structure which then interacts with and modifies the 
signals. Although the TestPro software was developed 
specifically for ultrasonic and eddy-current analysis, it 
was used here to determine its applicability to the 
study of acoustic emission signals. 

Feature Extraction 

TestPro software preprocesses each waveform, 
then calculates 71 features, 35 from the time domain 
signal and 36 from the frequency domain, as listed in 
table B1. Preprocessing consists of subtracting the
mean value of the waveform data from each point. 
This process minimizes the direct-current (dc) compo- 
nent in the frequency domain resulting from the fast 
Fourier transform (FFT), but this does not necessarily 
result in the endpoints of the signal being zero. Since 
nonzero endpoints can cause spurious high frequency 
components to appear in the power spectrum, it is 
desirable to force the endpoints to zero. This forcing is 
accomplished by multiplying the first and last eight 
points of the signal by a cosine function. The number 
of data points is increased to the next power of 2 and 
padded with zeros to perform the FFT. 
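
The preprocessing steps described above can be sketched in Python as follows. The exact cosine function used by TestPro software is not given in the text, so a raised-cosine ramp from 0 to 1 over the eight end points is assumed here. It is also assumed that a record whose length is already a power of 2 is not lengthened. The function name is hypothetical.

```python
import math

def preprocess(waveform, taper_points=8):
    """Remove the mean, taper both ends toward zero, and zero-pad to a power of 2.

    Assumes the waveform has at least 2 * taper_points samples.
    """
    n = len(waveform)
    mean = sum(waveform) / n
    data = [value - mean for value in waveform]

    # Assumed taper: raised cosine, 0 at the end point and rising toward 1.
    for i in range(taper_points):
        weight = 0.5 * (1.0 - math.cos(math.pi * i / taper_points))
        data[i] *= weight
        data[n - 1 - i] *= weight

    # Pad with zeros up to the next power of 2 (unchanged if already one).
    padded_length = 1
    while padded_length < n:
        padded_length *= 2
    return data + [0.0] * (padded_length - n)

# Made-up 100-point waveform with a nonzero mean.
waveform = [float(i % 5) for i in range(100)]
prepared = preprocess(waveform)
print(len(prepared), prepared[0], prepared[99])
```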

The time domain features are extracted from the 
waveform, the cumulative distribution of the wave- 
form, and the envelope of the waveform. The wave- 
form features are maximum absolute value of the 
amplitude, or peak amplitude, and maximum peak-to-peak
amplitude. The waveform is then normalized by
dividing all amplitude values by the peak amplitude,
resulting in an amplitude range from -1.0 to 1.0.
Because the mean value of the waveform was sub- 
tracted, the resulting mean is 0; the standard deviation 
of the normalized amplitude values is calculated and 
stored as a waveform feature. 

The cumulative distribution of the normalized 
signal is calculated by computing a running sum of 
squares of the signal amplitude versus time. The final 
value of the running sum is equal to the total power of 
the signal. The cumulative distribution is analyzed to 
determine the points in time where the distribution 
crosses 25, 50, 75, and 90 percent of the total power. 
The differences between the 50- and 25-percent levels, 
the 75- and 25-percent levels, and the 90- and 
25-percent levels are added to the feature set. 
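
The cumulative distribution features can be illustrated with the Python sketch below. It takes the crossing time to be the first sample at which the running sum reaches the stated percentage of the total, which is an assumption about a detail the text does not specify. The function names are hypothetical.

```python
def crossing_index(running_sum, percent):
    """Index of the first sample where the running sum reaches percent of the total."""
    total = running_sum[-1]
    for index, value in enumerate(running_sum):
        if value * 100.0 >= percent * total:
            return index
    return len(running_sum) - 1

def cumulative_power_features(signal, dt=1.0):
    """Return the 50-25, 75-25, and 90-25 percent time differences."""
    running_sum, total = [], 0.0
    for value in signal:
        total += value * value        # running sum of squared amplitude
        running_sum.append(total)
    times = {p: crossing_index(running_sum, p) * dt for p in (25, 50, 75, 90)}
    return (times[50] - times[25], times[75] - times[25], times[90] - times[25])

# Made-up constant-amplitude signal: power accumulates linearly with time.
features = cumulative_power_features([1.0] * 100)
print(features)
```

For a signal whose power accumulates linearly, the three differences are simply 25, 50, and 65 percent of the record length, which gives a quick check on the sketch.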

The envelope of the signal is determined by apply- 
ing a smoothing function to the positive amplitude 
peaks of the signal. It approximates a numeric integra- 
tion of the waveform. The resulting envelope is nor- 
malized by dividing by the peak amplitude, and the 
mean and standard deviation are computed and 
included in the feature set. The remaining time domain 
features are measured from rise and fall time charac- 
teristics of the envelope. Rise and fall times are deter- 
mined at points where the envelope crosses thresholds 
of 25, 50, and 75 percent of the peak amplitude. Local 
rise and fall times are those times at which the thresh- 
old crossing is nearest the maximum value of the 
envelope; global rise and fall times are those at which 
the threshold crossing is farthest from the peak. Rise 
and fall slopes indicate how fast the envelope function 
rises or falls; rise and fall variances indicate the varia- 
tion of amplitude values between the thresholds and 
the peak. To calculate the slopes and variances, 
TestPro software performs a linear least-squares 
regression on the data points between each threshold 
crossing and the peak amplitude. Global and local 
pulse durations are calculated by subtracting the corre- 
sponding rise and fall times. 
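
The distinction between local and global threshold crossings can be illustrated with the sketch below. It assumes that the envelope has already been computed and that rise and fall times are crossing instants measured from the start of the record, which is consistent with pulse duration being obtained by subtraction. The function names and the rule used to detect a crossing between adjacent samples are illustrative assumptions.

```python
def threshold_times(envelope, fraction, dt=1.0):
    """Return (local_rise, global_rise, local_fall, global_fall) or None."""
    peak_index = max(range(len(envelope)), key=lambda i: envelope[i])
    level = fraction * envelope[peak_index]

    # Upward crossings at or before the peak.
    rises = [i for i in range(1, peak_index + 1)
             if envelope[i - 1] < level <= envelope[i]]
    # Downward crossings at or after the peak.
    falls = [i for i in range(peak_index, len(envelope) - 1)
             if envelope[i] >= level > envelope[i + 1]]
    if not rises or not falls:
        return None  # envelope never crosses the threshold on one side

    # Local crossings lie nearest the peak; global crossings lie farthest.
    return rises[-1] * dt, rises[0] * dt, falls[0] * dt, falls[-1] * dt


def pulse_durations(envelope, fraction, dt=1.0):
    """Return (local_duration, global_duration) at the given threshold."""
    times = threshold_times(envelope, fraction, dt)
    if times is None:
        return None
    local_rise, global_rise, local_fall, global_fall = times
    return local_fall - local_rise, global_fall - global_rise
```

For an envelope with a small precursor and a trailing echo, the local duration spans only the main pulse and the global duration spans the whole event.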

The frequency domain features are measured from 
the power spectrum of the normalized waveform and 
the cumulative distribution of the power spectrum. 
The FFT is calculated and the squares of the real and 
imaginary components are summed to generate a 
power spectrum, which is then normalized by the 
power level. The mean and standard deviation of the 
normalized power spectrum are calculated and 
included in the feature set. 
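
A minimal sketch of the power spectrum calculation follows. For self-containment it uses a direct discrete Fourier transform rather than an FFT, which gives the same values at greater cost. The function name `power_spectrum`, the use of a one-sided spectrum, and normalization by the summed power are assumptions; the text does not define "power level" more precisely.

```python
import math


def power_spectrum(data):
    """Return the one-sided power spectrum, normalized to unit total power."""
    n = len(data)
    spectrum = []
    for k in range(n // 2 + 1):
        real = sum(x * math.cos(2.0 * math.pi * k * t / n)
                   for t, x in enumerate(data))
        imag = -sum(x * math.sin(2.0 * math.pi * k * t / n)
                    for t, x in enumerate(data))
        # Sum of the squares of the real and imaginary components.
        spectrum.append(real * real + imag * imag)

    total = sum(spectrum)
    if total == 0.0:
        return spectrum
    return [p / total for p in spectrum]  # assumed normalization
```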


The frequency at which the maximum value of 
the power spectrum occurs is located. The local 
50-percent rise and fall frequencies are the half-power 
points closest to the frequency of the peak power. The 
center frequency is defined as the average of the local 
50-percent rise and fall frequencies. The bandwidth is 
the difference of these two frequencies divided by the 
frequency of the peak and expressed as a percentage. 
Local and global spectral features are determined in a 
manner similar to the local and global time domain 
features described earlier. The fraction-of-total-power 
estimates are obtained by computing the power 
contributions over the relevant frequency intervals 
specified in table B1 (features 44-47) and then dividing 
by the power contribution between the local rise and 
fall frequencies at 25 percent of the peak power. The 
remaining frequency domain features are analogous to 
those measured from the envelope function in the time 
domain. 
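
The peak frequency, center frequency, and bandwidth defined above can be sketched as follows. The function name `spectral_peak_features` is hypothetical, and the half-power points are taken, as an assumption, to be the outermost bins around the peak that remain at or above half of the peak power, without interpolation between bins.

```python
def spectral_peak_features(freqs, power):
    """Return (peak_frequency, center_frequency, bandwidth_percent)."""
    peak_index = max(range(len(power)), key=lambda i: power[i])
    half_power = 0.5 * power[peak_index]

    # Walk outward from the peak until the spectrum drops below half power.
    lo = peak_index
    while lo > 0 and power[lo - 1] >= half_power:
        lo -= 1
    hi = peak_index
    while hi < len(power) - 1 and power[hi + 1] >= half_power:
        hi += 1

    f_peak = freqs[peak_index]
    f_rise, f_fall = freqs[lo], freqs[hi]   # local 50-percent frequencies
    center = 0.5 * (f_rise + f_fall)
    if f_peak == 0.0:
        raise ValueError("peak at zero frequency; bandwidth is undefined")
    bandwidth_percent = 100.0 * (f_fall - f_rise) / f_peak
    return f_peak, center, bandwidth_percent
```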

Feature Selection 

TestPro software uses a k-nearest neighbor algo- 
rithm to analyze the waveform features and to learn to 
distinguish signals from different classes. This learn- 
ing requires a set of known signals for each of the 
classes. The value of k used for learning is the square 
root of the number of signals in the smallest set of the 
training data. 

TestPro software first attempts to classify the sig- 
nals using each feature individually. For each wave- 
form in the database, its k nearest neighbors are 
identified by using minimum distance in a single 
dimension. Using the class value of the majority of the 
k nearest neighbors, a class call for the waveform is 
determined. If this class call is not the same as the 
given class of the waveform, an error counter is incre- 
mented. This process is repeated for all waveforms in 
the training set for the single feature being analyzed; 
this results in an estimate of the classification error 
using the single feature. This process is repeated to 
obtain a single error estimate for each feature. The fea- 
ture with the minimum single error is selected as the 
optimum feature. The entire process is repeated to 
determine the second optimum feature. The nearest 
neighbor criterion now involves computation of a two- 
dimensional distance to determine the k nearest neigh- 
bors, where the first dimension is the first optimum 
feature and the second is the feature being analyzed. 
The error analysis is again performed for each feature, 
and the feature with the minimum error is added to the 
set of optimum features. This process is repeated, with 
the distance determination expanding to multiple 
dimensions, until either the number of optimum 
features equals 10 or adding another feature to the 
optimum set results in no further reduction of the 
overall error. 
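
The selection procedure amounts to a greedy forward search driven by a k-nearest neighbor error count, and can be sketched as below. This is an illustration under stated assumptions rather than the TestPro implementation: a waveform is excluded from its own neighbor list, distances are Euclidean on unscaled feature values, the square root used for k is rounded to the nearest integer, and ties in the vote go to the class encountered first. The function names are hypothetical.

```python
import math
from collections import Counter


def knn_error(rows, labels, features, k):
    """Count misclassified waveforms using only the given feature columns."""
    errors = 0
    for i, row in enumerate(rows):
        distances = []
        for j, other in enumerate(rows):
            if j == i:
                continue  # assumed: a waveform is not its own neighbor
            d = math.sqrt(sum((row[f] - other[f]) ** 2 for f in features))
            distances.append((d, j))
        distances.sort()
        votes = Counter(labels[j] for _, j in distances[:k])
        if votes.most_common(1)[0][0] != labels[i]:
            errors += 1
    return errors


def select_features(rows, labels, max_features=10):
    """Greedily add the feature that gives the lowest error at each step."""
    smallest_class = min(Counter(labels).values())
    k = max(1, round(math.sqrt(smallest_class)))  # assumed rounding

    selected, best_error = [], None
    candidates = list(range(len(rows[0])))
    while candidates and len(selected) < max_features:
        scored = [(knn_error(rows, labels, selected + [f], k), f)
                  for f in candidates]
        error, feature = min(scored)
        if best_error is not None and error >= best_error:
            break  # adding a feature gives no further reduction
        selected.append(feature)
        candidates.remove(feature)
        best_error = error
    return selected, best_error
```

Here `rows` holds one list of feature values per training waveform and `labels` holds the known class of each waveform.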

TestPro software then allows several classifiers, or 
discriminants, to be developed for analyzing 
unknown signals. These are a Gaussian probability 
density function, a Fisher linear discriminant, and a 
k-nearest neighbor nonlinear discriminant function, 
where k ranges from 1 to 20. 

Waveform Analysis 

Waveform analysis is the process of classifying 
unknown signals. Each classifier, or discriminant 
function, is used to determine the probability of each 
unknown waveform belonging to each of the classes 
defined in the learning process. The total probability 
sums to 100 percent over all the classes for each 
waveform. 

Each classifier uses some measure of the distance 
between the feature values of the waveform being ana- 
lyzed and the mean values of the features used in train- 
ing to determine the class probabilities. A confidence 
level is also given as an indication of how closely the 
evaluation point fits the mean values of the training 
data. Each feature is scaled by subtracting the mean 
value of the training set and dividing by its standard 
deviation. This value represents the distance between 
the feature being evaluated and the mean of the train- 
ing set in standard deviations. This distance is deter- 
mined for each of the defined classes and converted to 
a qualitative confidence level. If the difference is less 
than or equal to two standard deviations (2σ), the 
confidence level is high. A difference greater than 2σ and 
less than or equal to 3σ is a medium confidence level. 
A difference greater than 3σ is a low confidence level. 
The confidence level of the minimum difference is 
assigned to the feature being evaluated. 
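
The confidence rule can be sketched as follows. The function names and the layout of `class_stats` (class name mapped to the training mean and standard deviation of one feature) are illustrative assumptions. The helper `overall_confidence` applies the same thresholds to the largest scaled difference found across the selected features.

```python
def confidence_label(scaled_difference):
    """Map a distance in standard deviations to a qualitative level."""
    if scaled_difference <= 2.0:
        return "high"
    if scaled_difference <= 3.0:
        return "medium"
    return "low"


def feature_scaled_difference(value, class_stats):
    """Smallest distance, in standard deviations, to any class mean."""
    return min(abs(value - mean) / std for mean, std in class_stats.values())


def overall_confidence(values, stats_per_feature):
    """Confidence taken from the largest per-feature scaled difference."""
    worst = max(feature_scaled_difference(v, s)
                for v, s in zip(values, stats_per_feature))
    return confidence_label(worst)
```

For example, a feature value of 15 evaluated against classes with (mean, standard deviation) of (10, 2) and (30, 5) lies 2.5 and 3.0 standard deviations from the two means, so the feature receives a medium confidence level.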

This process is repeated for each additional feature 
in the optimum feature set. An overall confidence 
level is determined by selecting the maximum of the 
scaled differences for each feature and converting it to 
a confidence level. 

Table B1. Waveform Features Calculated by TestPro Software 

Feature   Description 

          Radio frequency (RF) waveform 
1         Maximum absolute amplitude of RF waveform 
2         Maximum peak-to-peak amplitude of RF waveform 
3         Mean value of normalized RF waveform amplitude values 
4         Variance of normalized RF waveform amplitude values 

          RF waveform cumulative distribution (CD) 
5         Difference between 50- and 25-percent level (RF waveform CD) 
6         Difference between 75- and 25-percent level (RF waveform CD) 
7         Difference between 90- and 25-percent level (RF waveform CD) 

          RF waveform envelope function 
8         Local pulse duration between 25-percent levels 
9         Global pulse duration between 25-percent levels 
10        Mean value of normalized envelope function 
11        Variance of normalized envelope function 
12        Local rise time from 25-percent level to peak 
13        Local rise time from 50-percent level to peak 
14        Local fall time from peak to 25-percent level 
15        Local fall time from peak to 50-percent level 
16        Local rise slope between 25-percent level and peak 
17        Local rise variance between 25-percent level and peak 
18        Local rise slope between 50-percent level and peak 
19        Local rise variance between 50-percent level and peak 
20        Local fall slope between peak and 25-percent level 
21        Local fall variance between peak and 25-percent level 
22        Local fall slope between peak and 50-percent level 
23        Local fall variance between peak and 50-percent level 
24        Global rise time from 25-percent level to peak 
25        Global rise time from 50-percent level to peak 
26        Global fall time from peak to 25-percent level 
27        Global fall time from peak to 50-percent level 
28        Global rise slope between 25-percent level and peak 
29        Global rise variance between 25-percent level and peak 
30        Global rise slope between 50-percent level and peak 
31        Global rise variance between 50-percent level and peak 
32        Global fall slope between peak and 25-percent level 
33        Global fall variance between peak and 25-percent level 
34        Global fall slope between peak and 50-percent level 
35        Global fall variance between peak and 50-percent level 

          Spectrum cumulative distribution 
36        Difference between 25- and 50-percent level (spectrum CD) 
37        Difference between 25- and 75-percent level (spectrum CD) 
38        Difference between 25- and 90-percent level (spectrum CD) 

          Power spectrum 
39        Frequency of maximum value of power spectrum 
40        Center frequency of power spectrum 
41        Measured bandwidth 
42        Mean value of normalized power spectrum 
43        Variance of normalized power spectrum 
44        Fraction of total power between lower 25-percent level and peak 
45        Fraction of total power between lower 50-percent level and peak 
46        Fraction of total power between peak and upper 25-percent level 
47        Fraction of total power between peak and upper 50-percent level 
48        Local rise frequency from 25-percent level to peak 
49        Local rise frequency from 50-percent level to peak 
50        Local fall frequency from peak to 25-percent level 
51        Local fall frequency from peak to 50-percent level 
52        Local rise slope between 25-percent level and peak of spectrum 
53        Local rise variance between 25-percent level and peak of spectrum 
54        Local rise slope between 50-percent level and peak of spectrum 
55        Local rise variance between 50-percent level and peak of spectrum 
56        Local fall slope between peak of spectrum and 25-percent level 
57        Local fall variance between peak of spectrum and 25-percent level 
58        Local fall slope between peak of spectrum and 50-percent level 
59        Local fall variance between peak of spectrum and 50-percent level 
60        Global rise frequency between 25-percent level and peak of spectrum 
61        Global rise frequency between 50-percent level and peak of spectrum 
62        Global fall frequency between peak of spectrum and 25-percent level 
63        Global fall frequency between peak of spectrum and 50-percent level 
64        Global rise slope between 25-percent level and peak of spectrum 
65        Global rise variance between 25-percent level and peak of spectrum 
66        Global rise slope between 50-percent level and peak of spectrum 
67        Global rise variance between 50-percent level and peak of spectrum 
68        Global fall slope between peak of spectrum and 25-percent level 
69        Global fall variance between peak of spectrum and 25-percent level 
70        Global fall slope between peak of spectrum and 50-percent level 
71        Global fall variance between peak of spectrum and 50-percent level 